Abstract

Asymptotic behaviour of conditional $\alpha$-diversity for the two-parameter Poisson-Dirichlet partition model and for the normalized generalized Gamma model has recently been investigated in Favaro et al. (2009, 2011) with a view to possible applications in the Bayesian treatment of species richness estimation. Here we generalize those results to the larger class of mixed Poisson-Kingman species sampling models driven by the stable subordinator (Pitman, 2003).

1 Introduction 

The concept of $\alpha$-diversity for infinite exchangeable random partitions $\Pi$ of the positive integers, induced by randomly sampling from an almost surely discrete probability measure $P$, was first introduced in Pitman (2003, cfr. Sect. 6.1 Prop. 13) as the random variable $S$, with $0 < S < \infty$, such that, almost surely for $n \to \infty$,

$$\frac{K_n}{n^{\alpha}} \to S$$

for $K_n$ the number of blocks in the random partition $\Pi_n$ of $[n]$ induced by $\Pi$ and $\alpha \in (0,1)$. For infinite random partitions induced by a r.p.m. $P$ whose ranked atoms follow a Poisson-Kingman distribution $PK(\rho_\alpha, \gamma)$ driven by the Lévy density of the stable subordinator $\rho_\alpha(dx) = \alpha\Gamma(1-\alpha)^{-1}x^{-\alpha-1}dx$, for $0 < \alpha < 1$, with $\gamma$ on $(0,\infty)$ some mixing density, Pitman shows that $S = T^{-\alpha}$, where $T = S^{-1/\alpha}$ is the random total sum of the ranked atoms of $P$ and has density $\gamma$. (See Pitman, 2006, for a comprehensive reference on exchangeable random partitions.)
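The almost sure limit above can be illustrated by simulation. The following is a minimal sketch, assuming the two-parameter $(\alpha,\theta)$ Poisson-Dirichlet model and its standard sequential (Chinese restaurant) sampling scheme, in which a new block is opened with probability $(\theta + k\alpha)/(\theta + i)$ when $i$ observations already form $k$ blocks. The function name `simulate_blocks` and the parameter values are hypothetical choices made for illustration only.

```python
import random

def simulate_blocks(n, alpha, theta, rng):
    """Number of blocks K_n after n draws from a two-parameter
    Chinese restaurant process (assumed sampling scheme, see the text above)."""
    k = 0  # current number of blocks
    for i in range(n):  # i observations have been allocated so far
        # The first observation always opens a block; afterwards a new block
        # opens with probability (theta + k*alpha) / (theta + i).
        if i == 0 or rng.random() < (theta + k * alpha) / (theta + i):
            k += 1
    return k

rng = random.Random(0)
alpha, theta = 0.5, 1.0
for n in (100, 1_000, 10_000):
    k_n = simulate_blocks(n, alpha, theta, rng)
    # Each line is an independent run, so the ratio is one draw of K_n / n^alpha.
    print(n, k_n, k_n / n ** alpha)
```

Only the block count is tracked, since under this scheme the probability of a new block does not depend on the block sizes. Repeating the run with different seeds gives a rough picture of the spread of $K_n/n^\alpha$ for large $n$.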
Recently interest in conditional $\alpha$-diversity has emerged in posterior species richness estimation in a Bayesian nonparametric approach to species sampling problems (cfr. Lijoi et al., 2007, 2008; Cerquetti, 2009). Given $K_n = k$ the number of species in a partition $(n_1, \dots, n_k)$ induced by a basic sample of observed species, conditional $\alpha$-diversity is defined as the random variable $S_\alpha^{n,k}$ such that, almost surely for $m \to \infty$,

$$\frac{K_m}{m^{\alpha}} \,\Big|\, (K_n = k) \to S_\alpha^{n,k}$$

for $K_m$ the unknown number of new species induced by an additional sample of observations of dimension $m$.

In particular Favaro et al. (2009, 2011) derive distributional results for the conditional $\alpha$-diversity, respectively under two-parameter $(\alpha, \theta)$ Poisson-Dirichlet priors (Pitman and Yor, 1997), which are well known to correspond to r.p.m.s whose ranked atoms follow a $PK(\rho_\alpha, \gamma_{\alpha,\theta})$ distribution for $\gamma_{\alpha,\theta}(t) = \frac{\Gamma(\theta+1)}{\Gamma(\theta/\alpha+1)} t^{-\theta} f_\alpha(t)$, for $0 < \alpha < 1$ and $\theta > -\alpha$, and under normalized generalized Gamma priors, whose ranked atoms follow a Poisson-Kingman law with mixing density the exponentially tilted version of the $\alpha$-stable density (cfr. Cerquetti, 2007; Lijoi et al. 2007a). In Cerquetti (2011) a first alternative derivation for the conditional $\alpha$-diversity of the two-parameter Poisson-Dirichlet family has been obtained by a decomposition approach exploiting a characterization of those models in terms of the deletion of classes property (cfr. Pitman, 2003; Gnedin et al., 2009). Here we obtain the general result for the entire class of mixed Poisson-Kingman models driven by the stable subordinator, for an arbitrary mixing density $\gamma$ on $(0, \infty)$ written as, without loss of generality, $\gamma(t) = h(t) f_\alpha(t)$ for some non-negative function $h$ such that $\gamma$ is a proper density and $f_\alpha$ the density of the stable subordinator. This class includes infinitely many priors and, as shown in Gnedin and Pitman (2006), induces exchangeable random partitions in Gibbs form of type $\alpha$, i.e. uniquely characterized by an exchangeable partition probability function (EPPF) in the following Gibbs product form

$$p(n_1, \dots, n_k) = V_{n,k} \prod_{j=1}^{k} (1-\alpha)_{n_j - 1}, \qquad (1)$$

where $(a)_b = a(a+1)\cdots(a+b-1)$ are rising factorials and the $V_{n,k}$ are weights satisfying the backward recursive relation $V_{n,k} = (n - k\alpha)V_{n+1,k} + V_{n+1,k+1}$.
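The backward recursion can be checked numerically on a concrete family of weights. The sketch below assumes the two-parameter Poisson-Dirichlet weights in their product form, $V_{n,k} = \prod_{i=1}^{k-1}(\theta + i\alpha)/(\theta+1)_{n-1}$, which is equivalent to the Gamma-function form given in Example 3. The helper names `pd_weight` and `recursion_gap` are hypothetical.

```python
import math

def pd_weight(n, k, alpha, theta):
    """V_{n,k} of the two-parameter Poisson-Dirichlet model (assumed form):
    prod_{i=1}^{k-1} (theta + i*alpha) / (theta + 1)_{n-1}."""
    num = math.prod(theta + i * alpha for i in range(1, k))
    den = math.prod(theta + 1 + j for j in range(n - 1))  # rising factorial
    return num / den

def recursion_gap(n, k, alpha, theta):
    """Left side minus right side of
    V_{n,k} = (n - k*alpha) V_{n+1,k} + V_{n+1,k+1}."""
    lhs = pd_weight(n, k, alpha, theta)
    rhs = ((n - k * alpha) * pd_weight(n + 1, k, alpha, theta)
           + pd_weight(n + 1, k + 1, alpha, theta))
    return lhs - rhs

for n, k in [(1, 1), (5, 2), (10, 7)]:
    # The gap should vanish up to floating point error.
    print(n, k, recursion_gap(n, k, alpha=0.3, theta=1.5))
```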

2 Main results 

The basic result in Gnedin and Pitman (2006, cfr. Th. 12) states that the EPPF of each exchangeable Gibbs partition of type $\alpha$, for $\alpha \in (0, 1)$, arises by mixing the EPPF corresponding to the Poisson-Kingman $PK(\rho_\alpha | T = t)$ model with the density $\gamma$ identifying the specific $PK(\rho_\alpha, \gamma)$ family, hence

$$p_{\alpha,\gamma}(n_1, \dots, n_k) = \int_0^\infty p_\alpha(n_1, \dots, n_k | t)\, \gamma(t)\, dt.$$

The explicit formula for $p_\alpha(n_1, \dots, n_k | t)$ was first obtained in Pitman (2003, cfr. eq. (66)) and is central to our main result.

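Theorem 1. Let $\Pi$ be a $PK(\rho_\alpha, \gamma)$ partition of $\mathbb{N}$ for some $0 < \alpha < 1$ and some mixing probability distribution $\gamma$ on $(0, \infty)$. Without loss of generality assume $\gamma(t) = h(t) f_\alpha(t)$. Fix $n \geq 1$ and a partition $(n_1, \dots, n_k)$ of $[n]$ with $k$ positive box-sizes, then $\Pi$ has conditional $\alpha$-diversity with density

$$f^{n,k}_{h,\alpha}(s) = \frac{h(s^{-1/\alpha})\, \tilde{g}^{n,k}_{\alpha}(s)}{E^{\alpha}_{n,k}[h(S^{-1/\alpha})]} \qquad (2)$$

for

$$\tilde{g}^{n,k}_{\alpha}(s) = \frac{\Gamma(n)}{\Gamma(n - k\alpha)\Gamma(k)}\, s^{k - 1/\alpha - 1} \int_0^1 p^{n-1-k\alpha} f_\alpha((1-p)s^{-1/\alpha})\, dp \qquad (3)$$

the density of the product of independent r.v.s $Y_{\alpha,k} \times [W]^{\alpha}$, where $Y_{\alpha,k}$ has density

$$g_{\alpha,k}(y) = \frac{\Gamma(k\alpha + 1)}{\Gamma(k+1)}\, y^{k} g_\alpha(y) \qquad (4)$$

for $g_\alpha(y) = \alpha^{-1} y^{-1-1/\alpha} f_\alpha(y^{-1/\alpha})$, and $W \sim \beta(k\alpha, n - k\alpha)$. Here $E^{\alpha}_{n,k}$ denotes expectation with respect to $\tilde{g}^{n,k}_{\alpha}$.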

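Proof: By the unconditional result in Pitman (2003) recalled in the Introduction, the $\alpha$-diversity for a general mixing density $\gamma(t) = h(t) f_\alpha(t)$ is the r.v. $S_{\alpha,h}$ with density

$$\tilde{\gamma}(s^{-1/\alpha}) = h(s^{-1/\alpha}) f_\alpha(s^{-1/\alpha})\, \alpha^{-1} s^{-1/\alpha - 1}. \qquad (5)$$

As from Pitman (2003, cfr. Prop. 9), given $S_\alpha = s$, the exchangeable partition probability function for a $PK(\rho_\alpha | s)$ model corresponds to

$$p_\alpha(n_1, \dots, n_k | s^{-1/\alpha}) = \frac{\alpha^{k} s^{k}}{\Gamma(n - k\alpha)\, f_\alpha(s^{-1/\alpha})} \int_0^1 p^{n-1-k\alpha} f_\alpha((1-p)s^{-1/\alpha})\, dp\, \prod_{j=1}^{k} (1-\alpha)_{n_j - 1},$$

hence by Bayes' rule

$$f_{S_{\alpha,\gamma}}(s | n_1, \dots, n_k) = \frac{p_\alpha(n_1, \dots, n_k | s^{-1/\alpha})\, \tilde{\gamma}(s^{-1/\alpha})}{\int_0^\infty p_\alpha(n_1, \dots, n_k | s^{-1/\alpha})\, \tilde{\gamma}(s^{-1/\alpha})\, ds}$$

which, since the factors $f_\alpha(s^{-1/\alpha})$ cancel and all terms not depending on $s$ drop out of the ratio, simplifies to

$$f_{S_{\alpha,h}}(s | K_n = k) = \frac{h(s^{-1/\alpha})\, s^{k - 1/\alpha - 1} \int_0^1 p^{n-1-k\alpha} f_\alpha((1-p)s^{-1/\alpha})\, dp}{\int_0^\infty h(s^{-1/\alpha})\, s^{k - 1/\alpha - 1} \left[\int_0^1 p^{n-1-k\alpha} f_\alpha((1-p)s^{-1/\alpha})\, dp\right] ds}
$$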



and the result is proved. Notice that by definition of the mixed $PK(\rho_\alpha, h \times f_\alpha)$ model, the general weights $V_{n,k,h}$ in the EPPF (1) arise as follows

$$V_{n,k,h} = \frac{\alpha^{k-1}}{\Gamma(n - k\alpha)} \int_0^\infty h(s^{-1/\alpha})\, s^{k - 1/\alpha - 1} \int_0^1 p^{n-1-k\alpha} f_\alpha((1-p)s^{-1/\alpha})\, dp\, ds$$

hence the normalizing constant in formula (2) may be obtained through the following relationship (see also Ho et al. 2008, eq. (12))

$$E^{\alpha}_{n,k}[h(S^{-1/\alpha})] = V_{n,k,h}\, \frac{\Gamma(n)}{\alpha^{k-1}\Gamma(k)}. \qquad (6)$$
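The product representation $Y_{\alpha,k} \times [W]^{\alpha}$ in Theorem 1 can be sanity-checked through its moments. The sketch below is only an illustration: it assumes the moment identity $\int_0^\infty y^{q} g_\alpha(y)\, dy = \Gamma(q+1)/\Gamma(q\alpha+1)$, which is the normalization implied by (4), and compares the product of the two factor moments with the closed form $\Gamma(k+r)\Gamma(n)/[\Gamma(k)\Gamma(n+r\alpha)]$ obtained by simplifying the Gamma functions. All function names are hypothetical.

```python
import math

def log_moment_g(q, alpha):
    """log of int y^q g_alpha(y) dy = Gamma(q+1) / Gamma(q*alpha+1)
    (assumed normalization, consistent with the constant in (4))."""
    return math.lgamma(q + 1) - math.lgamma(q * alpha + 1)

def moment_product(r, n, k, alpha):
    """E[(Y_{alpha,k} * W^alpha)^r] computed from the two independent factors."""
    # Y_{alpha,k} has density proportional to y^k g_alpha(y).
    log_y = log_moment_g(k + r, alpha) - log_moment_g(k, alpha)
    # W ~ Beta(a, b): E[W^(alpha*r)] is a ratio of Gamma functions.
    a, b = k * alpha, n - k * alpha
    log_w = (math.lgamma(a + alpha * r) + math.lgamma(a + b)
             - math.lgamma(a) - math.lgamma(a + b + alpha * r))
    return math.exp(log_y + log_w)

def moment_closed_form(r, n, k, alpha):
    """Gamma(k+r) Gamma(n) / (Gamma(k) Gamma(n + r*alpha))."""
    return math.exp(math.lgamma(k + r) + math.lgamma(n)
                    - math.lgamma(k) - math.lgamma(n + r * alpha))

for r in (1, 2, 3):
    print(r, moment_product(r, 10, 3, 0.4), moment_closed_form(r, 10, 3, 0.4))
```

Working on the log scale with `math.lgamma` keeps the computation stable for larger $n$ and $k$.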
Remark 2. The result in Theorem 1 agrees with an analogous result for the conditional distribution $T | K_n = k$ (where $T$ is the random total sum of the ranked atoms of the r.p.m. $P$) first derived in an unpublished manuscript by Ho et al. (2008, cfr. Eq. (13)), which we received from one of those authors as a personal communication. Their result relies on the r.v.

$$R_{\alpha,(n,k)} = \frac{S_{\alpha,k\alpha}}{\beta(k\alpha, n - k\alpha)}$$

for $S_{\alpha,k\alpha}$ the polynomially tilted stable random variable with density

$$f_{S_{\alpha,k\alpha}}(t) = \frac{\Gamma(k\alpha + 1)}{\Gamma(k + 1)}\, t^{-k\alpha} f_\alpha(t).$$

It is an easy task to show that $[R_{\alpha,(n,k)}]^{-\alpha} = [Y_{\alpha,k} \times \beta(k\alpha, n - k\alpha)^{\alpha}]$.

In what follows we show how some distributional results for the conditional $\alpha$-diversity of specific Poisson-Kingman models already obtained in the literature may be easily derived from the general formula in Theorem 1.



Example 3. [Two-parameter Poisson-Dirichlet $(\alpha, \theta)$ partition models] A first result for the conditional $\alpha$-diversity for the two-parameter Poisson-Dirichlet model (Pitman & Yor, 1997) has been derived in Favaro et al. (2009) in view of Bayesian nonparametric posterior interval estimation for the number of new species in an additional sample in species sampling problems. Those authors rely on mimicking the original proof for the unconditional $\alpha$-diversity in Pitman (2006, Th. 3.8). To apply the general result it is enough to notice that the two-parameter Poisson-Dirichlet $(\alpha, \theta)$ partition model, for $\theta > -\alpha$, is well known to correspond (cfr. Pitman, 2003) to a mixed Poisson-Kingman model driven by the stable subordinator with mixing density

$$\gamma_{\alpha,\theta}(t) = h(t) \times f_\alpha(t) = \frac{\Gamma(\theta + 1)}{\Gamma(\theta/\alpha + 1)}\, t^{-\theta} f_\alpha(t)$$

with the weights in the Gibbs representation of the EPPF corresponding to

$$V^{\alpha,\theta}_{n,k} = \alpha^{k-1}\, \frac{\Gamma(\theta/\alpha + k)\, \Gamma(\theta + 1)}{\Gamma(\theta + n)\, \Gamma(\theta/\alpha + 1)}.$$

Hence, by (6), the denominator in (2) corresponds to

$$E^{\alpha}_{n,k}(h(Z^{-1/\alpha})) = \frac{\Gamma(\theta + 1)}{\Gamma(\theta/\alpha + 1)}\, E^{\alpha}_{n,k}(Z^{\theta/\alpha}) = \frac{\Gamma(\theta/\alpha + k)\, \Gamma(\theta + 1)\, \Gamma(n)}{\Gamma(\theta + n)\, \Gamma(\theta/\alpha + 1)\, \Gamma(k)}$$



and by Theorem 1 the conditional $\alpha$-diversity $Z^{n,k}_{\alpha,\theta}$ has density

$$f^{n,k}_{\alpha,\theta}(s) = \frac{\Gamma(\theta + n)}{\Gamma(\theta/\alpha + k)\, \Gamma(n - k\alpha)}\, s^{\theta/\alpha + k - 1/\alpha - 1} \int_0^1 p^{n-1-k\alpha} f_\alpha((1-p)s^{-1/\alpha})\, dp \qquad (7)$$

which corresponds to the $\theta/\alpha$ polynomial tilting of $\tilde{g}^{n,k}_{\alpha}$. It is an easy task to verify that this is the density of the product of independent r.v.s $Z^{n,k}_{\alpha,\theta} = Y_{\alpha,\theta/\alpha + k} \times [\beta(\theta + k\alpha, n - k\alpha)]^{\alpha}$, as already established in Cerquetti (2011) moving from a decomposition approach exploiting the deletion of classes property of this model (see Gnedin et al., 2009). In the next Proposition we prove that the result agrees with the original result in Favaro et al. (2009), expressed in terms of the alternative scale mixture representation $Y_{\alpha,(\theta+n)/\alpha} \times \beta(\theta/\alpha + k, n/\alpha - k)$. Notice that the equivalence in distribution between $R_{\alpha,(n,k)}$ and $S_{\alpha,n}[\beta_{k, n/\alpha - k}]^{-1/\alpha}$ was already stated without proof in Ho et al. (2008, cfr. Prop. 2.1, Eq. (11)).
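The normalizing constant used in this example can be cross-checked numerically. The following sketch assumes relation (6) in the form $E^{\alpha}_{n,k}[h(S^{-1/\alpha})] = V_{n,k,h}\,\Gamma(n)/[\alpha^{k-1}\Gamma(k)]$ and the $\theta = 0$ moment formula $E^{\alpha}_{n,k}[S^{r}] = \Gamma(k+r)\Gamma(n)/[\Gamma(k)\Gamma(n + r\alpha)]$. Both are stated here as assumptions of the illustration, and the function names are hypothetical.

```python
import math

def pd_weight(n, k, alpha, theta):
    """V_{n,k} for the (alpha, theta) model in product form:
    prod_{i=1}^{k-1} (theta + i*alpha) / (theta + 1)_{n-1}."""
    num = math.prod(theta + i * alpha for i in range(1, k))
    den = math.prod(theta + 1 + j for j in range(n - 1))
    return num / den

def normalizer_from_weight(n, k, alpha, theta):
    """V_{n,k} * Gamma(n) / (alpha^(k-1) * Gamma(k)), as in (6) (assumed form)."""
    return (pd_weight(n, k, alpha, theta) * math.gamma(n)
            / (alpha ** (k - 1) * math.gamma(k)))

def normalizer_from_moment(n, k, alpha, theta):
    """Gamma(theta+1)/Gamma(theta/alpha+1) times the (theta/alpha)-th moment
    of the theta = 0 conditional alpha-diversity."""
    r = theta / alpha
    log_h = math.lgamma(theta + 1) - math.lgamma(r + 1)
    log_m = (math.lgamma(k + r) + math.lgamma(n)
             - math.lgamma(k) - math.lgamma(n + theta))
    return math.exp(log_h + log_m)

for theta in (-0.2, 0.0, 1.2):
    print(theta,
          normalizer_from_weight(8, 3, 0.5, theta),
          normalizer_from_moment(8, 3, 0.5, theta))
```

For $\theta = 0$ both routes return 1, as expected for a density that needs no tilting.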



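Proposition 4. Let $H = Y_1 \times X$ for $Y_1$ and $X$ independent r.v.s, $Y_1 \sim g_{\alpha,(\theta+n)/\alpha}$ and $X \sim \beta(\theta/\alpha + k, n/\alpha - k)$, then the r.v. $Z^{n,k}_{\alpha,\theta}$, with density (7), and $H$ have the same characteristic function

$$\sum_{r=0}^{\infty} \frac{(it)^r}{r!} \left(\frac{\theta}{\alpha} + k\right)_r \frac{\Gamma(\theta + n)}{\Gamma(\theta + n + r\alpha)}. \qquad (8)$$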



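Proof: First notice that by Proposition 2 in Cerquetti (2011), for $m \to \infty$,

$$E_{\alpha,\theta}\left[\left(\frac{K_m}{m^{\alpha}}\right)^{r} \Big|\, K_n = k\right] \to \left(\frac{\theta + k\alpha}{\alpha}\right)_r \frac{\Gamma(\theta + n)}{\Gamma(\theta + n + r\alpha)}.$$

By the change of variable $s = z(1-p)^{-\alpha}$ in the inner integral, (7), now written in the variable $z$, becomes

$$f^{\alpha,\theta}_{n,k}(z) = \frac{\Gamma(\theta + n)\, \Gamma(\theta + k\alpha + 1)}{\Gamma(\theta + k\alpha)\, \Gamma(n - k\alpha)\, \Gamma((\theta + k\alpha)/\alpha + 1)}\, \frac{1}{\alpha}\, z^{\theta/\alpha + k - 1} \int_z^{\infty} \alpha^{-1} s^{-1/\alpha - 1} f_\alpha(s^{-1/\alpha}) \left[1 - (z/s)^{1/\alpha}\right]^{n - k\alpha - 1} ds.$$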

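Its characteristic function is given by

$$\phi^{\alpha,\theta}_{n,k}(t) = \frac{\Gamma(\theta + n)\, \Gamma(\theta + k\alpha + 1)}{\Gamma(\theta + k\alpha)\, \Gamma(n - k\alpha)\, \Gamma((\theta + k\alpha)/\alpha + 1)}\, \frac{1}{\alpha} \int_0^{\infty} \exp\{itz\}\, z^{\theta/\alpha + k - 1} \int_z^{\infty} g_\alpha(s) \left(1 - (z/s)^{1/\alpha}\right)^{n - k\alpha - 1} ds\, dz$$

and, exchanging the order of integration, may be rewritten as

$$\phi^{\alpha,\theta}_{n,k}(t) = \frac{\Gamma(\theta + k\alpha + 1)}{\Gamma((\theta + k\alpha)/\alpha + 1)}\, \frac{1}{\alpha} \int_0^{\infty} g_\alpha(s) \int_0^{s} e^{itz}\, \frac{\Gamma(\theta + n)}{\Gamma(\theta + k\alpha)\, \Gamma(n - k\alpha)}\, z^{\theta/\alpha + k - 1} \left(1 - (z/s)^{1/\alpha}\right)^{n - k\alpha - 1} dz\, ds.$$

The change of variable $(z/s)^{1/\alpha} = y$, $z = y^{\alpha} s$, $dz = s\alpha y^{\alpha - 1} dy$ yields

$$\frac{\Gamma(\theta + k\alpha + 1)}{\Gamma((\theta + k\alpha)/\alpha + 1)}\, \frac{1}{\alpha} \int_0^{\infty} g_\alpha(s) \int_0^1 e^{ity^{\alpha}s}\, \frac{\Gamma(\theta + n)}{\Gamma(\theta + k\alpha)\, \Gamma(n - k\alpha)}\, (y^{\alpha}s)^{\theta/\alpha + k - 1} (1 - y)^{n - k\alpha - 1}\, s\alpha y^{\alpha - 1}\, dy\, ds$$

and then reduces to

$$\frac{\Gamma(\theta + k\alpha + 1)}{\Gamma((\theta + k\alpha)/\alpha + 1)} \int_0^{\infty} s^{\theta/\alpha + k} g_\alpha(s) \int_0^1 e^{ity^{\alpha}s}\, \frac{\Gamma(\theta + n)}{\Gamma(\theta + k\alpha)\, \Gamma(n - k\alpha)}\, y^{\theta + k\alpha - 1} (1 - y)^{n - k\alpha - 1}\, dy\, ds.$$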

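Exploiting the known characteristic function of $Y^{\alpha}$ for $Y \sim \mathrm{Beta}(\theta + k\alpha, n - k\alpha)$ we can write

$$\frac{\Gamma(\theta + k\alpha + 1)}{\Gamma((\theta + k\alpha)/\alpha + 1)} \sum_{r=0}^{\infty} \frac{(it)^r}{r!}\, \frac{(\theta + k\alpha)_{r\alpha}}{(\theta + n)_{r\alpha}} \int_0^{\infty} s^{\theta/\alpha + k + r} g_\alpha(s)\, ds$$

where $(a)_{r\alpha} = \Gamma(a + r\alpha)/\Gamma(a)$, and by the moments of $g_\alpha$ implied by (4), namely $\int_0^{\infty} s^{q} g_\alpha(s)\, ds = \Gamma(q + 1)/\Gamma(q\alpha + 1)$,

$$= \sum_{r=0}^{\infty} \frac{(it)^r}{r!}\, \frac{(\theta + k\alpha)_{r\alpha}}{(\theta + n)_{r\alpha}}\, \frac{\Gamma(\theta + k\alpha + 1)}{\Gamma((\theta + k\alpha)/\alpha + 1)}\, \frac{\Gamma((\theta + k\alpha + r\alpha)/\alpha + 1)}{\Gamma(\theta + k\alpha + r\alpha + 1)}. \qquad (9)$$

By the usual properties of the Gamma function the last expression corresponds to

$$= \sum_{r=0}^{\infty} \frac{(it)^r}{r!}\, \frac{\Gamma(\theta + k\alpha + r\alpha)\, \Gamma(\theta + n)}{\Gamma(\theta + k\alpha)\, \Gamma(\theta + n + r\alpha)} \cdot \frac{(\theta + k\alpha)\, \Gamma(\theta + k\alpha)}{\frac{\theta + k\alpha}{\alpha}\, \Gamma\left(\frac{\theta + k\alpha}{\alpha}\right)} \cdot \frac{\frac{\theta + k\alpha + r\alpha}{\alpha}\, \Gamma\left(\frac{\theta + k\alpha}{\alpha} + r\right)}{(\theta + k\alpha + r\alpha)\, \Gamma(\theta + k\alpha + r\alpha)}$$

which simplifies to

$$= \sum_{r=0}^{\infty} \frac{(it)^r}{r!} \left(\frac{\theta + k\alpha}{\alpha}\right)_r \frac{\Gamma(\theta + n)}{\Gamma(\theta + n + r\alpha)}$$

and the conclusion follows by the result in Proposition 2 in Favaro et al. (2009) proving that this series is the characteristic function of $H = Y_1 \times X$.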
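The equality of the two characteristic functions amounts to the equality of all moments of the two product representations. As a hedged numerical illustration, the sketch below computes the $r$-th moment of $Y_{\alpha,\theta/\alpha+k} \times [\beta(\theta + k\alpha, n - k\alpha)]^{\alpha}$ and of $Y_{\alpha,(\theta+n)/\alpha} \times \beta(\theta/\alpha + k, n/\alpha - k)$ and compares both with the coefficient $(\theta/\alpha + k)_r\, \Gamma(\theta + n)/\Gamma(\theta + n + r\alpha)$. It assumes the moment identity $\int_0^\infty y^{q} g_\alpha(y)\, dy = \Gamma(q+1)/\Gamma(q\alpha+1)$. The function names are hypothetical.

```python
import math

def log_tilted_moment(r, q, alpha):
    """log E[Y^r] for Y with density proportional to y^q g_alpha(y),
    using the assumed identity int y^q g_alpha = Gamma(q+1)/Gamma(q*alpha+1)."""
    return (math.lgamma(q * alpha + 1) - math.lgamma(q + 1)
            + math.lgamma(q + r + 1) - math.lgamma((q + r) * alpha + 1))

def log_beta_moment(s, a, b):
    """log E[B^s] for B ~ Beta(a, b)."""
    return (math.lgamma(a + s) + math.lgamma(a + b)
            - math.lgamma(a) - math.lgamma(a + b + s))

def moment_z(r, n, k, alpha, theta):
    """r-th moment of Y_{alpha, theta/alpha+k} * Beta(theta+k*alpha, n-k*alpha)^alpha."""
    return math.exp(log_tilted_moment(r, theta / alpha + k, alpha)
                    + log_beta_moment(alpha * r, theta + k * alpha, n - k * alpha))

def moment_h(r, n, k, alpha, theta):
    """r-th moment of Y_{alpha, (theta+n)/alpha} * Beta(theta/alpha+k, n/alpha-k)."""
    return math.exp(log_tilted_moment(r, (theta + n) / alpha, alpha)
                    + log_beta_moment(r, theta / alpha + k, n / alpha - k))

def moment_series_term(r, n, k, alpha, theta):
    """(theta/alpha + k)_r * Gamma(theta + n) / Gamma(theta + n + r*alpha)."""
    q = theta / alpha + k
    return math.exp(math.lgamma(q + r) - math.lgamma(q)
                    + math.lgamma(theta + n) - math.lgamma(theta + n + r * alpha))

for r in (1, 2, 3):
    args = (r, 10, 3, 0.5, 1.0)
    print(r, moment_z(*args), moment_h(*args), moment_series_term(*args))
```

The three columns should coincide up to floating point error for any admissible $(n, k, \alpha, \theta)$. This is a check of the algebra, not a substitute for the proof.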

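Example 5. [Generalized Gamma partition models] As from Pitman (2003), generalized Gamma partition models belong to the Poisson-Kingman family driven by the stable subordinator $PK(\rho_\alpha, \gamma)$ for a mixing density

$$\gamma_{\alpha,\lambda}(t) = \exp\{\psi_\alpha(\lambda) - \lambda t\}\, f_\alpha(t),$$

where $\psi_\alpha(\lambda) = (2\lambda)^{\alpha}$ is the Laplace exponent of $f_\alpha(\cdot)$. By an application of (5), after the reparametrization $\lambda = \beta^{1/\alpha}/2$, the unconditional $\alpha$-diversity for this model (see also Cerquetti (2007) and Lijoi et al. (2007b)) has density

$$\tilde{\gamma}_{\alpha,\beta}(s^{-1/\alpha}) = \exp\left\{\beta - \frac{1}{2}\left(\frac{\beta}{s}\right)^{1/\alpha}\right\} f_\alpha(s^{-1/\alpha})\, \alpha^{-1} s^{-1/\alpha - 1}.$$

To obtain the density of the conditional posterior $\alpha$-diversity it is enough to apply formula (2), which provides

$$f^{n,k}_{\alpha,\beta}(s) = \frac{\exp\left\{\beta - \frac{1}{2}\left(\frac{\beta}{s}\right)^{1/\alpha}\right\} \tilde{g}^{n,k}_{\alpha}(s)}{E^{\alpha}_{n,k}\left[\exp\left\{\beta - \frac{1}{2}\left(\frac{\beta}{S}\right)^{1/\alpha}\right\}\right]}.$$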
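The denominator arises by (6) and the known expression for the $V_{n,k}$ of $PK(\rho_\alpha, \gamma_{\alpha,\beta})$ models as obtained in Eq. (6) in Cerquetti (2007),

$$V_{n,k} = \frac{\alpha^{k} 2^{n} e^{\beta}}{\Gamma(n)} \int_0^{\infty} \lambda^{n-1}\, \frac{e^{-(\beta^{1/\alpha} + 2\lambda)^{\alpha}}}{(\beta^{1/\alpha} + 2\lambda)^{n - k\alpha}}\, d\lambda, \qquad (10)$$

which, rewritten in terms of incomplete Gamma functions $\Gamma(a; x) = \int_x^{\infty} u^{a-1} e^{-u}\, du$ by the change of variable $(\beta^{1/\alpha} + 2\lambda)^{\alpha} = x$, $d\lambda = (2\alpha)^{-1} x^{1/\alpha - 1} dx$, yields

$$V_{n,k} = \frac{e^{\beta} \alpha^{k-1}}{\Gamma(n)} \sum_{i=0}^{n-1} \binom{n-1}{i} (-1)^{i} \beta^{i/\alpha}\, \Gamma\left(k - \frac{i}{\alpha}; \beta\right).$$

By equation (6)

$$E^{\alpha}_{n,k}\left[\exp\left\{\beta - \frac{1}{2}\left(\frac{\beta}{S}\right)^{1/\alpha}\right\}\right] = \frac{e^{\beta}}{\Gamma(k)} \sum_{i=0}^{n-1} \binom{n-1}{i} (-1)^{i} \beta^{i/\alpha}\, \Gamma\left(k - \frac{i}{\alpha}; \beta\right)$$

hence by Theorem 1 the conditional $\alpha$-diversity $S^{n,k}_{\alpha,\beta}$ for the generalized Gamma model has density

$$f^{n,k}_{\alpha,\beta}(s) = \frac{\Gamma(k)\, \exp\left(-2^{-1}(\beta/s)^{1/\alpha}\right) \tilde{g}^{n,k}_{\alpha}(s)}{\sum_{i=0}^{n-1} \binom{n-1}{i} (-1)^{i} \beta^{i/\alpha}\, \Gamma\left(k - \frac{i}{\alpha}; \beta\right)}.$$

The result agrees with Favaro et al. (2011, Th. 1) due to the equivalence in distribution between $Y_{\alpha,n/\alpha} \times \beta(k, n/\alpha - k)$ and $Y_{\alpha,k} \times \beta(k\alpha, n - k\alpha)^{\alpha}$, which arises by specializing the result in Proposition 4 to $\theta = 0$.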